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Abstract 

The correspondence between the cross-entropy method and the zero- variance 
approximation to simulate a rare event problem in Markov chains is shown. This 
leads to a sufficient condition that the cross-entropy estimator is asymptotically 
optimal. 

1 Model and problem 

We deal with a discrete-time Markov chain (X(t))%B. on a finite but large state space 
X, and with a matrix of transition probabilities P = {p{x,y)) x ,y£X, he., p(x,y) = 
¥(X(t + l) = y\X(t) = x). The state space is partitioned into three sets: Q is a small 
set of 'good' states, J 7 is a small set of 'failed' or 'bad' states, and the other states 
form the large set of 'internal' states T = X \ (Q U J 7 ). For each internal state x € T 
let j(x) be the probability that the Markov chain will hit the failure set before the 
good set when the chain starts in state x. Or, more formally, define 

T = inf{i > : X(t) GgUJ}, 

then 7(x) = P(X(T) € J-\X(0) = x). For ease of notation we assume that the 
good set consists of a single state, denoted by 0, and that the Markov chain jumps 
immediately out of it (but not immediately to a bad state), i.e., p(0, 0) = and 
p(0, J-) = 0. We are interested in the probability that when the chain starts in the 
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good state 0, it will hit the failure set J- before returning to 0. The associated event 
is denoted by A = 1{X(T) € F}. 

In this paper we consider the problem of estimating the hitting probability P(^4) 
by simulation. The typical applications that we have in mind, are system failures in 
models of highly reliable Markovian system, and excessive backlogs in product-form 
Jackson queueing networks. The failure set is a rare event that should occur only 
with very small probability, which makes the need for reducing the simulation vari- 
ance, for instance by importance sampling. Reliability and queueing systems have 
been studied largely in relation to importance sampling estimation of the perfor- 
mance measures, we refer to overviews in [9] and in [10] . 

The rare event probability ¥(A) will be estimated by an importance sampling 
simulation method that implements a change of measure P*, i.e., the estimator Y is 
the average of i.i.d. replications of 

UA\— 

where dF/dF* is the likelihood ratio, and where we assume that l{A}dP is absolute 
continuous w.r.t. d¥*. The issue is to find a good change of measure in terms of 
performance of the estimator Y. The optimal change of measure would be 

P°Pt(.) =P(-L4), 

for which Y would have zero variance [10] . It is shown in [5j |8] that the associated 
Markov chain has transition probabilities 

p^(x,y)= P (x,y)^\, xeT,yeX, (l) 

and p opt (x,y) = p(x,y) for good state x = 0, and bad states x £ T. The opti- 
mal transition probabilities cannot be used directly for simulation since they require 
knowledge of the unknown hitting probabilities, however they suggest to construct 
an importance sampling algorithm by approximating or estimating the hitting prob- 
abilities 7(x). For instance, let H(x) be the set of all sample paths of the Markov 
chain of the form n = (x = xq — > xi — > ■ — > x^) with x , . . . , x^-i € T, Xk £ T and 

p(xj,Xj + i) > (j = 0, . . . , k — 1), and where k = 1, 2, The probability of path 

7r to occur is p(ir) = YljZo P( x j> x j+i)> anci , clearly, the hitting probability becomes 
7( x ) = Z^7rgn(a:) Pi 71 )- I n [5] it is studied how the approximation 

1 a PP(:r) = max p(ir) 
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performs in reliability problems. In a slightly different context, [6l Section 4] con- 
siders the rare event probability that a random walk reaches high levels. It is shown 
that it fits in the Markov chain framework and an approximation of r y(x) is proposed 
based on an asymptotic approximation of these probabilities. 

Another line of research has been developed in |3j H] for rare event problems 
in which the probability of interest can be approximated via large deviations. In 
this framework, the decay rate is given in terms of a variational problem, or an 
optimal control problem. These problems are related to a family of nonlinear partial 
differential equations known as Hamilton- Jacobi-Bellmann (HJB) equations. It is 
shown how subsolutions of the HJB equations associated with rare event problems 
could be used to construct efficient importance sampling schemes. This method has 
been successfully applied to several queueing systems, see 011]. 

The cross-entropy method for rare event simulation [p2] considers to choose a 
change of measure P ce from a specified family of changes of measures that minimizes 
the Kullback-Leibler distance from the optimal It has been shown by pQ that 

the associated transition probabilities p ce (x,y) are of the form 

E[l{A}N(x,y)\X(0)=O] 
P M) E[l{A}Y, zeX N( X ,z)\X(0)=oy { > 

where N(x, y) is the number of times that transition (x, y) occurs in a random 
sample path of the Markov chain until absorption in one of the good or bad states. 
Again, these transition probabilities cannot be used directly for simulation since they 
contain the unknown variables N(x, y), however, in this case they suggest to estimate 
the expectations in expression , for instance by simulation. This approach has 
been applied to queueing systems in [H [2] , and to reliability systems in [11] . 

In the following sections we shall give more background on the cross-entropy 
method and how it results in the expression ([2]). More importantly, we shall show 
that in fact the cross-entropy solution is the zero- variance distribution, i.e., the 
matrices of transition probabilities satisfy P ce = P opt . This identity will be the 
basis to formulate in Section 3 a sufficient condition for which the estimated cross- 
entropy solution is asymptotically optimal. 
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2 Correspondence cross-entropy and zero-variance 



Let (Jl,A, P) be the probability space of the sample paths of the Markov chain 
X(0),X(1), . . . , and denote by X a random sample path. Recall that our objective is 
to execute simulations of the Markov chain for estimating the rare event probability 
P(^4) = P(X(T) € J 7 ), that we execute these simulations under a change of measure 
P*, and that the optimal change of measure is 

dP° pt = ^4rdP. (3) 

Suppose that we consider only changes of measures P* under which the Markov 
property is retained, say with matrix of transition probabilities P* , and suppose 
that we minimize the Kullback-Leibler distance between these changes of measures 
and the optimal one, i.e., 

inf £>(P opt ,P*), 

P*£V 



where the cross-entropy is defined by 



£>flP° p \P*) = E opt 



jpopt 



E 



(4) 



dF y ' ° V dF* 

Substituting ([3]), minimizing the cross-entropy, and deleting constant terms yields 

sup E[l{A}logdP*(X)]. (5) 

p*ev 

Since the sample path probability dP*(X) is a product of individual transition 
probabilities, we get 

T 

d¥*(X) = l[p*(X(t-l),X(t))= H p*(x,y) N ^\ (6) 

t=l (x,y)EXxX 

where N(x, y) is the number of times transition (x, y) occurring in the random 
sample path X. Substituting the expression ([6]) into the cross-entropy optimization 
program ([S}, and applying the first order condition using a Lagrange multiplier, 
gives the solution (|2|) for the individual transition probabilities. 

Notice that the optimal transition matrix P ov,Xl is a feasible matrix, i.e., an el- 
ement of V, and thus it must hold that it is the cross-entropy solution. We shall 
give a direct proof of the matrix identity P cc = P°P t 7 based on the expressions of 
the transition probabilities. In fact we shall prove a relation between the expected 
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number of transitions from x to y and the absorption probability j(y). Denote by 
v(x) the expected number of visits to state x starting at x before absorption: 



oo 



V 



(x)=E ^l{X(t)=x}X(0) 



= x 



u=o 



Proposition 1. For all x,y £ X: 



E[l{A}N(x,y)\X(0) 



x] = v(x)p(x,y)j(y). 



Proof. For ease of notation we assume that we have the equivalent modelling in 
which all the good and bad states are absorbing. Introduce probabilities (for any 
x,y e X) 

f(x,y) = F({X(t)) reaches state y\X(0) = x) 

g(y) = F((X(t)) reaches bad set T without a transition x — > y\X(0) = y). 

Notice that we allow x and y to be a good or bad state for which these probabilities 
are obviously either zero or one. Consider the event 



for n > 1. This event can only occur if (A) n — 1 times (i) a number of times 
[transition x — > y' ^ y followed by a return to x], followed by (ii) [transition x — > y 
followed by a return to x]; then (A) is followed by (B) which is (iii) a number of 
times [transition x — >■ y' ^ y followed by a return to x], followed by (iv) [transition 
x — > y followed by reaching the bad set without the transition x — > y\. That is, 



{N(x,y) = n} n {reach bad set from x}, 



E[l{A}N(x,y)\X(0) = x] 



n-l 



= IZ n ( [£ ( ^2p( x ^y')f(.y'^ x )) P{z,y)f(y,x\ 

n=l V fc=0 y'^y ^ ' ' 



X 




(iii) 
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Let us work out the summations using geometric series and denoting 

oo , 

a = ^P(x,y')f(y',x); ft = ^ ( ^ p(x, y')f(y', x)j . 
V'+V k=0 y'^y 

Hence, 

E[l{A}N(x,y)\X(0) = x] = _J_ ^-L-p^j,)^). (7) 
In the same manner we determine the absorption probability 

7 ( y ) = F((X(t)) reaches the bad set \X(0) = y). 
Partition this event with respect to the number of transitions x — > y: 

oo 

7 (y) = ^F(^;7V(x,y) = n|X(0) = y) 

n=0 

oo 

= 3(2/) + Yl f(y> x ) p ^' Ar ( x ' f ) = n i x (°) = x ) 

n=l 

= g(y) + ( ^ x ^[^2\^2 p ( x > y')f(y'> x )) ^( x ' y)f(y> x ) I 

n=l V fc=0 y'^y ) 

00 , 

x ( ^2p(x,y')f(y',x)\ p(x,y)g(y) 

k=0 y'^y 

= g{y) + 1 [X) ( Y2p( x >y , )f(y r > x )) p( x ^y)f(y^ x )\ a(y)- 

n=l V fc=0 j/V^ / 
When we include the first term g{y) as the zero-th term of the summation, we obtain 

7(f) = T^ (y) ' (8) 
From the expressions ([7]) and ([8]) we see that 

E[l{A}N(x,y)\X{0) = x] = ( — !— p(x,yh(y). 



1-/3 1-a 

To conclude, we calculate the proportionality factor: 
11 11 



1 - (3 1 - a 1 p( x >v)f(v^) 1 - rv 

1— a 

1 



1 - a - p(x,y)f(y,x) 1 - 52 y ^ y p(x,y')f(y',x) - p(x,y)f(y,x) 

1 1 ( \ 

v[x). 



1 -T,yP( x >y)f(y> x ) 1 ~f( x , x ) 

The last equality is a well-known relation for Markov chains, e.g., see [7]. □ 
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Corollary 2. For all x,y G X: 

p cc (x,y)=p opt (x,y). 
Proof. The identity follows easily by noting that 

E[l{A}N(x,y)\X(0) = 0] = f(0,x)E[l{A}N(x,y)\X(0) = x]. 

□ 

3 Asymptotic optimality 

In this section we assume that there is a family of rare events {A n } parameterized 
by n = 1,2,... such that each A n satisfies the model assumptions of the previous 
section, and such that lim n _>. 00 ¥(A n ) = 0. Suppose that Y* is an unbiased estimator 
of ¥(A n ) obtained by a change of measure P*. Then this estimator is asymptotically 
optimal if 

n->oo \og¥(A n ) 

see for instance [9]. Now, recall the cross-entropy representation p ce (x,y) in (J2]) of 
the zero-variance transition probabilities, and suppose that these are estimated by 
p ce (x,y). A common approach is to apply an iterative scheme to the optimization 
program (0). Since the program involves the rare event, we first apply a change of 
measure: 

dF 



E[1{X(T) G .F}logdP*(X)] = E (0) 



dP(°) 



1{X(T) G 7"}logdP*(X^ 



(9) 



This is done for a probability measure P*- - 1 such that (i) the Markov property is 
retained; (ii) the associated matrix of transition probabilities is feasible G V; 
(iii) the set T is 'not so' rare under p(°). The program Q is solved iteratively by 
estimation: let . . . , be i.i.d. sample paths of the Markov chain generated 

by simulating the states according to a matrix of transition probabilities P^ until 
absorption in the good or bad states, then we calculate for j = 0, 1, . . . 

h 

p(j+i) = max J^(x«)l{X«(T) G J-}logdP*(X«). (10) 

i=l 

We repeat this 'updating' of the change of measure a few times until the difference 
pO'+i) _ p(j) i s sma n 

enough (in some matrix norm). For details we refer to |12j . 
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In this way we obtain an implementable change of measure P ce , and its associated 
importance sampling estimator is denoted by 

dF 



dl 



(X)l{A n }. 
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Clearly, this estimator is unbiased, i.e., E^fY^ ] = P(^4 n j*|. We claim that it is 
asymptotically optimal if the following condition holds. 

Condition 1. There are finite posivite constants Ki,K 2 such that for all n 

Ki < X>(P opt ,P cc ) < K 2 . 

This is a condition on the approximation of the zero-variance measure by the im- 
plementation of cross-entropy method. Actually, a weaker condition for the upper 
bound suffices: V(F°v\F ce ) = o(logP(A n )) for n ->• oo. 



Theorem 1. Assume Condition^ Then the cross-entropy importance sampling 
estimator is asymptotically optimal. 



Proof. Notice that X'(P°P t ,P ce ) = E°P t [log dF opt / dF ce {X)\ > 0. Because log F(A r 
— oo, the upper bound in Condition Q] ensures that 



E opt 



lim 

n— >oo 



i og c^(x; 

° -7TOCC V J 



logP(A 



0. 



Furthermore, the lower bound gives 

lim sup 



logE opt 


_ dP ce ^ 


E°Pt 


log 


dP co ^ ' _ 



< oo. 



Now consider, (using ([3]) in the third equality in the following lines), 

2" 



cc\2 



E c 



E c 



dF 
dP ce 

dF 
dP°P* 

2 t&cc 



X)l{A n } 
(X)l{A n ) 



F(A n yE 



dF o P t 



dF c 



(X' 



~^—{x) 

dF ce 

= P(A n ) 2 E opt 



dpop t 



(X) 



x We denote the expectation w.r.t. measure P by E, w.r.t. measure P cc by E cc , w.r.t. measure 
P opt by E opt , etc. 
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So, we can conclude 

i og E-[(y n ce ) 2 ] _ i°g(n^)) 2 + iogie°pt 



lo g P(A.) 

= 2 + 



log¥(A n ) 



6 d pcc V ! 



with 



l ogE opt dF^ {X) 

lim : — ; . 

rwoo \ogf{A n ) 



logP(^n) 



lim 



logE opt 


" d popt 

. dF cc 


(X) 


E opt 


[l 0g rfP^(X)l 

to d pcc \ ) 


E opt 


log 


d popt 

dP ca 


(X) 


logP(A n ) 



= 0. 



□ 



4 A numerical example 

We illustrate the theorem by the simple example of simulating the M/M/l queue 
(Poisson-A arrivals, exponential-// services) where A < \i. We consider its associated 
discrete-time Markov chain (X(t))^ by embedding the continuous-time queueing 
process at the jump times. The transition probabilities are 

p = p(x, x + 1) = — ^— (a: = 0,1,...) 

A + /i 

q = p(x, x - 1) = —y— (x = 1,2,...). 

A + /i 

The rare event is hitting state n before returning to the zero state. For this model 
the optimal (zero- variance) transition probabilities follow easily from calculating the 
hitting probabilities 

y(x) = mX(t)) reaches n before 0|X(0) = x), 

for x = 1, . . . , n — 1, by solving the equations 

7(2) = p(x, x - 1)7(0; - 1) + p(x, x + 1)7(2: + 1), 

with boundary conditions 7(0) = and j(n) = 1. Let a = ///A. Then we get 

1 — cr x 1 — cr a;+1 1 — a x ~ l 

^( x ) = ~, p opt (x,x + 1) =p— — ; p opt (x,x- 1) = g- — . 

1 — a n 1 — a x 1 — (7 

Notice that ^(1,2) = 1. 
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The cross-entropy between the optimal probability measure P opt and any other 
probability measure Q which is associated with transition probabilities q(x, x + 1) 
and q(x, x — 1), can be determined as follows (where we apply the product form (|6|)): 



V 



jgopt 



cflpopt 

log— TR-W 



jgopt 



n iQ g 

(a;,y)6«YxAf 



p 0pt (z,2/) 

q(x,y) 



N(x,y) 



X(0) = 



log 



p opt (s,y) 



E° pt [iV(x,y)LY(0)=0]. 



{x,y)eXxX 

We may follow the same reasoning as in the proofs of Propositions Q] and [2] for 
calculating E^Nix, y)\X(0) = 0]. Notice that under P ?* = 1, /(0,a;) = 1, 

and 7(y) = 1, thus 

E°P t [iV(x,y)|X(0) = 0] = rP t (0,x)^P t (x)p°P t (x,y)7 opt (y) = v op \x)p opt (x,y). 

Finally, the visiting numbers v opt (x) = 1/(1 — / opt (x,x)) can be calculated numeri- 
cally via recursion (using f opt (x,x + 1) = 1): 

/ opt (x, x) = p opt (x, x-l) + p opt (x, x + l)/ opt (x + 1, x) 

f opt (x , lx) = p opt (x + l,x) 

J y 1 ' l-p°P t (x + l,x + 2)/°P t (x + 2,x + l)' 

We have implemented the cross-entropy method of Section [3] for queueing pa- 
rameters A = 0.8 and fi = 1. The rare-event state n was increased from n = 10 
until n = 250. The cross-entropy updating rule (jlOp was iterated ten times, start- 
ing from the uniform transition probabilities (the sample sizes k were increased 
proportionally to n). After the ten updating iterations we estimated the rare 
event probability ¥(A n ) with sample size 1000 and collected the estimated rela- 



tive errors (RE) y Var [Y£ e ] /E ce [Y£ e ] of the associated estimators, and the esti- 
mated ratios (RAT) logE cc [(y„ cc ) 2 ]/ log E cc [y„ ce ]. The figures below illustrate that 
P(P°P t ,P ce )/P(A n ) -> as n -)• oo, and that RAT is close to two, meaning that 
the estimator is asymptotically optimal. The last figure with the relative errors RE 
indicates even strong efficiency (bounded RE). 
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Figure 1. £>(P opt , ¥ cc )/¥(A n ). 
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Figure 2. Estimated ratio RAT. 
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Figure 3. Estimated relative error RE. 



5 Conclusion 



After having constructed an importance sampling algorithm for rare-event simula- 
tion, a key issue is to assess the statistical performance of the associated impor- 
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tance sampling estimator. In this paper we have considered rare-event problems in 
Markov chains, for which we have used the cross-entropy method as the engine of 
finding a change of measure for executing the importance sampling simulations. We 
have shown that for these problems the cross-entropy method coincides with the 
zero-variance approach. This non-implementable optimal change of measure is esti- 
mated by an implementable change of measure that is returned by the cross-entropy 
method. Our main result is that we give a sufficient condition for the associated im- 
portance sampling estimator to be logarithmically efficient. Further investigations 
are undertaken to obtain conditions for strong efficiency. 
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